6 Spatial Data and Maps
We’ll explore the basics of simple features (sf) for building spatial datasets, then some common mapping methods, probably:
- ggplot2
- tmap
6.1 Spatial Data
To work with spatial data requires extending R to deal with it using packages. Many have been developed, but the field is starting to mature using international open GIS standards.
sp (until recently, the dominant library of spatial tools)
- Includes functions for working with spatial data
- Includes
spplotto create maps - Also needs
rgdalpackage forreadOGR– reads spatial data frames.
sf (Simple Features)
- ISO 19125 standard for GIS geometries
- Also has functions for working with spatial data, but clearer to use.
- Doesn’t need many additional packages, though you may still need
rgdalinstalled for some tools you want to use. - Replacing
spandspplotthough you’ll still find them in code. We’ll give it a try… - Works with ggplot2 and tmap for nice looking maps.
Cheat sheet: https://github.com/rstudio/cheatsheets/raw/master/sf.pdf
6.1.0.1 simple feature geometry sfg and simple feature column sfc
6.1.1 Examples of simple geometry building in sf
sf functions have the pattern st_*
st means “space and time”
See Geocomputation with R at https://geocompr.robinlovelace.net/ or https://r-spatial.github.io/sf/ for more details, but here’s an example of manual feature creation of sf geometries (sfg):
library(tidyverse)
library(sf)
library(iGIScData)[As usual, go to the relevant project, in this case generic_methods]
library(sf)
eyes <- st_multipoint(rbind(c(1,5), c(3,5)))
nose <- st_point(c(2,4))
mouth <- st_linestring(rbind(c(1,3),c(3, 3)))
border <- st_polygon(list(rbind(c(0,5), c(1,2), c(2,1), c(3,2),
c(4,5), c(3,7), c(1,7), c(0,5))))
face <- st_sfc(eyes, nose, mouth, border) # sfc = sf column
plot(face)
Figure 6.1: Building simple geometries in sf
The face was a simple feature column (sfc) built from the list of sfgs. An sfc just has the one column, so is not quite like a shapefile.
- But it can have a coordinate referencing system CRS, and so can be mapped.
- Kind of like a shapefile with no other attributes than shape
[westUS]
6.1.2 Building a mappable sfc from scratch
CA_matrix <- rbind(c(-124,42),c(-120,42),c(-120,39),c(-114.5,35),
c(-114.1,34.3),c(-114.6,32.7),c(-117,32.5),c(-118.5,34),c(-120.5,34.5),
c(-122,36.5),c(-121.8,36.8),c(-122,37),c(-122.4,37.3),c(-122.5,37.8),
c(-123,38),c(-123.7,39),c(-124,40),c(-124.4,40.5),c(-124,41),c(-124,42))
NV_matrix <- rbind(c(-120,42),c(-114,42),c(-114,36),c(-114.5,36),
c(-114.5,35),c(-120,39),c(-120,42))
CA_list <- list(CA_matrix); NV_list <- list(NV_matrix)
CA_poly <- st_polygon(CA_list); NV_poly <- st_polygon(NV_list)
sfc_2states <- st_sfc(CA_poly,NV_poly,crs=4326) # crs=4326 specifies GCS
st_geometry_type(sfc_2states)## [1] POLYGON POLYGON
## 18 Levels: GEOMETRY POINT LINESTRING POLYGON MULTIPOINT MULTILINESTRING MULTIPOLYGON GEOMETRYCOLLECTION CIRCULARSTRING ... TRIANGLE
library(tidyverse)
ggplot() + geom_sf(data = sfc_2states)
Figure 6.2: A simple map built from scratch with hard-coded data as simple feature columns
sf class
Is like a shapefile: has attributes to which geometry is added, and can be used like a data frame.
attributes <- bind_rows(c(abb="CA", area=423970, pop=39.56e6),
c(abb="NV", area=286382, pop=3.03e6))
twostates <- st_sf(attributes, geometry = sfc_2states)
ggplot(twostates) + geom_sf() + geom_sf_text(aes(label = abb))## Warning in st_point_on_surface.sfc(sf::st_zm(x)): st_point_on_surface may not give correct results for longitude/latitude data
Figure 6.3: Using an sf class to build a map, displaying an attribute
6.1.3 Creating features from shapefiles or tables
sf’s st_read reads shapefiles
- shapefile is an open GIS format for points, polylines, polygons
You would normally have shapefiles (and all the files that go with them – .shx, etc.)
stored on your computer, but we’ll access one from the iGIScData external data folder [sierra]:
library(iGIScData)
library(sf)
shpPath <- system.file("extdata","CA_counties.shp", package="iGIScData")
CA_counties <- st_read(shpPath)## Reading layer `CA_counties' from data source `C:\Users\900008452\Documents\R\win-library\4.0\iGIScData\extdata\CA_counties.shp' using driver `ESRI Shapefile'
## Simple feature collection with 58 features and 60 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -124.4152 ymin: 32.53427 xmax: -114.1312 ymax: 42.00952
## Geodetic CRS: WGS 84
plot(CA_counties)## Warning: plotting the first 9 out of 60 attributes; use max.plot = 60 to plot all

st_as_sf converts data frames
- using coordinates read from x and y variables, with crs set to coordinate system (4326 for GCS)
sierraFebpts <- st_as_sf(sierraFeb, coords = c("LONGITUDE", "LATITUDE"), crs=4326)
plot(sierraFebpts)
[air_quality]
library(tidyverse)
library(sf)
library(iGIScData)
censusCentroids <- st_centroid(BayAreaTracts)
TRI_sp <- st_as_sf(TRI_2017_CA, coords = c("LONGITUDE", "LATITUDE"),
crs=4326) # simple way to specify coordinate reference
bnd <- st_bbox(censusCentroids)
ggplot() +
geom_sf(data = BayAreaCounties, aes(fill = NAME)) +
geom_sf(data = censusCentroids) +
geom_sf(data = CAfreeways, color = "grey") +
geom_sf(data = TRI_sp, color = "yellow") +
coord_sf(xlim = c(bnd[1], bnd[3]), ylim = c(bnd[2], bnd[4])) +
labs(title="Bay Area Counties, Freeways and Census Tract Centroids")
Figure 6.4: ggplot map of Bay Area TRI sites, census centroids, freeways
6.1.4 Coordinate Referencing System
Say you have data you need to make spatial with a spatial reference
sierra <- read_csv("sierraClimate.csv")
EPSG or CRS codes are an easy way to provide coordinate referencing.
Two ways of doing the same thing.
- Spell it out:
GCS <- "+proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0"
wsta = st_as_sf(sierra, coords = c("LONGITUDE","LATITUDE"), crs=GCS)
- Google to find the code you need and assign it to the crs parameter:
wsta <- st_as_sf(sierra, coords = c("LONGITUDE","LATITUDE"), crs=4326)
6.1.4.1 Removing Geometry
There are many instances where you want to remove geometry from a sf data frame
Some R functions run into problems with geometry and produce confusing error messages, like “non-numeric argument”
You’re wanting to work with an sf data frame in a non-spatial way
One way to remove geometry:
myNonSFdf <- mySFdf %>% st_set_geometry(NULL)
6.1.5 Spatial join st_join
A spatial join with st_join
joins data from census where TRI points occur [air_quality]
TRI_sp <- st_as_sf(TRI_2017_CA, coords = c("LONGITUDE", "LATITUDE"), crs=4326) %>%
st_join(BayAreaTracts) %>%
filter(CNTY_FIPS %in% c("013", "095"))6.1.6 Plotting maps in the base plot system
There are various programs for creating maps from spatial data, and we’ll look at a few after we’ve looked at rasters. As usual, the base plot system often does something useful when you give it data.
plot(BayAreaCounties)## Warning: plotting the first 9 out of 60 attributes; use max.plot = 60 to plot all

And with just one variable:
plot(BayAreaCounties["POP_SQMI"])
There’s a lot more we could do with the base plot system, but we’ll mostly focus on some better options in ggplot2 and tmap.
6.2 Raster GIS in R
Simple Features are feature-based, so based on the name I guess it’s not surprising that sf doesn’t have support for rasters. But we can use the raster package for that.
A bit of raster reading and map algebra with Marble Mountains elevation data [marbles]
library(raster)
rasPath <- system.file("extdata","elev.tif", package="iGIScData")
elev <- raster(rasPath)
slope <- terrain(elev, opt="slope")
aspect <- terrain(elev, opt="aspect")
slopeclasses <-matrix(c(0,0.2,1, 0.2,0.4,2, 0.4,0.6,3,
0.6,0.8,4, 0.8,1,5), ncol=3, byrow=TRUE)
slopeclass <- reclassify(slope, rcl = slopeclasses)
plot(elev)
plot(slope)
plot(slopeclass)
plot(aspect)
6.2.1 Raster from scratch
new_raster2 <- raster(nrows = 6, ncols = 6, res = 0.5,
xmn = -1.5, xmx = 1.5, ymn = -1.5, ymx = 1.5,
vals = 1:36)
plot(new_raster2)
6.3 ggplot2 for maps
The Grammar of Graphics is the gg of ggplot.
- Key concept is separating aesthetics from data
- Aesthetics can come from variables (using aes()setting) or be constant for the graph
Mapping tools that follow this lead
- ggplot, as we have seen, and it continues to be enhanced
- tmap (Thematic Maps) https://github.com/mtennekes/tmap Tennekes, M., 2018, tmap: Thematic Maps in R, Journal of Statistical Software 84(6), 1-39
ggplot(CA_counties) + geom_sf()
Try ?geom_sf and you’ll find that its first parameters is mapping with aes() by default. The data property is inherited from the ggplot call, but commonly you’ll want to specify data=something in your geom_sf call.
Another simple ggplot, with labels
ggplot(CA_counties) + geom_sf() +
geom_sf_text(aes(label = NAME), size = 1.5)## Warning in st_point_on_surface.sfc(sf::st_zm(x)): st_point_on_surface may not give correct results for longitude/latitude data

and now with fill color
ggplot(CA_counties) + geom_sf(aes(fill = MED_AGE)) +
geom_sf_text(aes(label = NAME), col="white", size=1.5)## Warning in st_point_on_surface.sfc(sf::st_zm(x)): st_point_on_surface may not give correct results for longitude/latitude data

Repositioned legend, no “x” or “y” labels
ggplot(CA_counties) + geom_sf(aes(fill=MED_AGE)) +
geom_sf_text(aes(label = NAME), col="white", size=1.5) +
theme(legend.position = c(0.8, 0.8)) +
labs(x="",y="")
Map in ggplot2, zoomed into two counties [air_quality]:
library(tidyverse); library(sf); library(iGIScData)
census <- BayAreaTracts %>%
filter(CNTY_FIPS %in% c("013", "095"))
TRI <- TRI_2017_CA %>%
st_as_sf(coords = c("LONGITUDE", "LATITUDE"), crs=4326) %>%
st_join(census) %>%
filter(CNTY_FIPS %in% c("013", "095"),
(`5.1_FUGITIVE_AIR` + `5.2_STACK_AIR`) > 0)## although coordinates are longitude/latitude, st_intersects assumes that they are planar
## although coordinates are longitude/latitude, st_intersects assumes that they are planar
bnd = st_bbox(census)
ggplot() +
geom_sf(data = BayAreaCounties, aes(fill = NAME)) +
geom_sf(data = census, color="grey40", fill = NA) +
geom_sf(data = TRI) +
coord_sf(xlim = c(bnd[1], bnd[3]), ylim = c(bnd[2], bnd[4])) +
labs(title="Census Tracts and TRI air-release sites") +
theme(legend.position = "none")
6.3.1 Rasters in ggplot2
Raster display in ggplot2 is currently a little awkward, as are rasters in general in the feature-dominated GIS world.
We can use a trick: converting rasters to a grid of points [marbles:
library(tidyverse)
library(sf)
library(raster)
rasPath <- system.file("extdata","elev.tif", package="iGIScData")
elev <- raster(rasPath)
shpPath <- system.file("extdata","trails.shp", package="iGIScData")
trails <- st_read(shpPath)## Reading layer `trails' from data source `C:\Users\900008452\Documents\R\win-library\4.0\iGIScData\extdata\trails.shp' using driver `ESRI Shapefile'
## Simple feature collection with 32 features and 8 fields
## Geometry type: LINESTRING
## Dimension: XY
## Bounding box: xmin: 481903.8 ymin: 4599196 xmax: 486901.9 ymax: 4603200
## Projected CRS: NAD83 / UTM zone 10N
elevpts = as.data.frame(rasterToPoints(elev))
ggplot() +
geom_raster(data = elevpts, aes(x = x, y = y, fill = elev)) +
geom_sf(data = trails)
6.4 tmap
Basic building block is tm_shape(data) followed by various layer elements such as tm_fill() shape can be features or raster See Geocomputation with R Chapter 8 “Making Maps with R” for more information. https://geocompr.robinlovelace.net/adv-map.html
library(spData)
library(tmap)
tm_shape(world) + tm_fill() + tm_borders()Color by variable [air_quality]
library(sf)
library(tmap)
tm_shape(BayAreaTracts) + tm_fill(col = "MED_AGE")